Identifying Topic and Focus by an Automatic Procedure

Authors

  • Eva Hajičová
  • Petr Sgall
  • Hana Skoumalová
Abstract

An algorithm for automatic identification of topic and focus of the sentence is presented, based on dependency syntax and using written input, which is much more ambiguous than spoken utterance.

1. The dichotomy of topic and focus, based, in the Praguean Functional Generative Description, on the scale of communicative dynamism (underlying word order), is relevant not only for a possible placement of the sentence in a context, but also for its semantic interpretation. The underlying word order differs from the surface one especially in that the verb stands more to the right than all its complementations belonging to the topic of the sentence (or to the local topic of the clause headed by the verb), and more to the left than those belonging to the focus. Using a dependency grammar (or, more or less equivalently, a flat structure in a constituency based grammar), we can illustrate this by the following example, where (1') is a simplified underlying representation of (1) on a reading answering e.g. the question Where has Charles found my pen?:

(1) Charles has found your pen in a box lying on the table.

(1') (Charles)Act ((you)Appurt pen)Obj find.Perf (box.Indef ((Rel)Act lie (table)Loc.on)Gener)Loc.in

In (1') every pair of parentheses encompasses a dependent item (i.e. corresponds to an edge of the linearized dependency tree), the indices of parentheses denote kinds of dependency (valency slots, or theta roles and adjuncts): Act stands for Actor (underlying Subject), Appurt for Appurtenance (Possessivity in a broader sense), Obj for Objective (underlying Object), Loc for Locative, Gener for the General Relationship (of an adjunct to its head); the other indices denote values of morphological categories (Perfect, Indefiniteness) and of adverbial prepositions (in, on), Rel denotes a relative pronoun (here
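The positional criterion stated above (topic complementations stand to the left of the verb in the underlying word order, focus complementations to its right) can be sketched in a few lines of Python. This is only an illustration of that one criterion, not the authors' algorithm, which has to work from ambiguous written input. It assumes the underlying order is already given; the function name `split_by_verb`, the tuple layout, and the `"Verb"` label are hypothetical, and the sketch deliberately does not classify the verb itself.

```python
from typing import List, Tuple

# (wording, dependency label). Act, Obj and Loc follow the abstract;
# "Verb" is a hypothetical marker for the governing verb.
Item = Tuple[str, str]


def split_by_verb(underlying_order: List[Item]) -> Tuple[List[Item], List[Item]]:
    """Split a clause's complementations by position relative to its verb.

    Items to the left of the verb are returned as topic candidates,
    items to the right as focus candidates. The verb is left unclassified.
    """
    positions = [i for i, (_, label) in enumerate(underlying_order) if label == "Verb"]
    if len(positions) != 1:
        raise ValueError("expected exactly one governing verb")
    v = positions[0]
    return underlying_order[:v], underlying_order[v + 1:]


# Underlying order of sentence (1) on the reading that answers
# "Where has Charles found my pen?" (top-level complementations only).
EXAMPLE_1: List[Item] = [
    ("Charles", "Act"),
    ("your pen", "Obj"),
    ("find", "Verb"),
    ("in a box lying on the table", "Loc"),
]

topic, focus = split_by_verb(EXAMPLE_1)
print("topic:", [wording for wording, _ in topic])
print("focus:", [wording for wording, _ in focus])
```

On this reading the Actor and the Objective fall into the topic and the Locative into the focus, which matches the question the sentence is taken to answer.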

Similar resources

A review of text mining approaches and their function in discovering and extracting a topic

Background and aim: Four text mining methods are examined and focused on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...

Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation

Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...

Identifying the challenges to good clinical rounds: A focus-group study of medical teachers

Introduction: The use of clinical rounds, as an integral part of clinical teaching to help medical students acquire essential skills of practicing medicine, is critically important. An understanding of medical teachers’ perceptions concerning the challenges of clinical rounds can help identify the key areas of focus to better foster professional development of medical students. This study explored th...

An Automatic Procedure for Topic-Focus Identification

The dichotomy of topic and focus, based, in the Praguean Functional Generative Description, on the scale of communicative dynamism, is relevant not only for a possible placement of the sentence in a context, but also for its semantic interpretation. An automatic identification of topic and focus may use the input information on word order, on the systemic ordering of kinds of complementations (...

Automatic Pavement Crack Detection Based on Aerial Imagery

Road health information is an important indicator for assessing the status of the road in management systems. Identifying the abandonment of surfaces is an important process in maintaining roads and traffic safety, which is traditionally conducted on the basis of field surveys. Today, remote sensing methods, especially photogrammetric imaging, are presented. In this article, based on by UAVs im...

Publication date: 1993